Haiku 2.3 Is Here And We Have Come So Far

Haiku 2.3 is here. This model has improved a lot compared to the old ones. A lot. Like, actually a lot. We have come so far from the days of pipe characters and chuamliamce.

When you look back at your old models and cringe, you know you have grown. When you look at your new models and smile, you know you have succeeded. I am smiling.

The Journey

Let me show you the journey. From Haiku-1 to Haiku-2.3. From outputting repeated words to forming actual sentences. From coherence scores of 1.99 to 6.03. From combined scores of 1.62 to 4.84.

Haiku-1 Combined Score: 1.62
Haiku-2.3 Combined Score: 4.84
Improvement: 3x

My Pride

I ran a benchmark. I tested all five versions side by side by side by side. Seven questions each. Same prompts. Same conditions. Here is what happened.

The Benchmark Results

Model              Fluency  Coherence  Relevance  Format  Combined
TMLM-Haiku-1          0.50       1.99       1.22    3.29      1.62
TMLM-Haiku-1.3        1.69       1.56       0.00    3.29      1.21
TMLM-Haiku-2          8.35       5.72       0.00    3.29      3.87
haiku (sft)           8.80       5.67       2.61    3.29      4.83
haiku_spin (sft)      8.78       6.03       2.25    3.29      4.84 ★

Look at that progression. Haiku-1 had a combined score of 1.62. Haiku-2.3 (the SPIN version, listed as haiku_spin in the table) has 4.84. That is three times better. Three times.

Fluency went from 0.50 to 8.78. Coherence went from 1.99 to 6.03. Relevance went from 1.22 to 2.25. Every metric improved except Format, which held at 3.29 for all five models.
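If you want to check the arithmetic yourself, here is a small sketch that recomputes the ratios from the table. The numbers are copied straight from the benchmark results. The `improvement` helper and the dictionary layout are just illustrative names I made up for this post, not part of the benchmark code.

```python
# Scores copied from the benchmark table above.
# Column order: fluency, coherence, relevance, format, combined.
SCORES = {
    "TMLM-Haiku-1":     (0.50, 1.99, 1.22, 3.29, 1.62),
    "TMLM-Haiku-1.3":   (1.69, 1.56, 0.00, 3.29, 1.21),
    "TMLM-Haiku-2":     (8.35, 5.72, 0.00, 3.29, 3.87),
    "haiku (sft)":      (8.80, 5.67, 2.61, 3.29, 4.83),
    "haiku_spin (sft)": (8.78, 6.03, 2.25, 3.29, 4.84),
}
METRICS = ("fluency", "coherence", "relevance", "format", "combined")


def improvement(old: str, new: str) -> dict[str, float]:
    """Return the new/old ratio for each metric.

    Hypothetical helper for illustration only. It assumes the old
    model has no zero scores, which holds for TMLM-Haiku-1.
    """
    return {m: n / o for m, o, n in zip(METRICS, SCORES[old], SCORES[new])}


if __name__ == "__main__":
    ratios = improvement("TMLM-Haiku-1", "haiku_spin (sft)")
    for metric, ratio in ratios.items():
        print(f"{metric:<10} {ratio:.2f}x")
```

Running it, the combined ratio comes out at about 2.99, which is where the "three times" comes from, and the format ratio is exactly 1.00 because that column never moved.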

What I Am Releasing

I am releasing three versions:

  • Instruct - The chat/instruct tuned version for conversational use
  • Pretrain - The base pretrained model for further fine-tuning
  • SPIN - The SPIN-trained version with best coherence and overall performance

Each version serves a different purpose. Use instruct for chatting. Use pretrain for your own experiments. Use SPIN for the best overall performance.

Example Outputs

Here are some examples from the benchmark. Look at how far we have come:

# Haiku-1 on "Explain AI"
Output: couldcouldoldbloodbloodbloodbodybodybodybeyonddonedevelopment...
# Repeated words. No coherence. Pure chaos.

# Haiku-2.3 (SPIN) on "Explain AI"
Output: The artificial intelligence is a problem where archaely receives a specific speed of spent records or personnel speakers. It is...
# Actual sentences. Actual structure. Actual progress.

The difference is night and day. The difference is three years of progress compressed into one model. The difference is worth celebrating.

Where To Get Them

https://huggingface.co/collections/CompactAI-O

All models are on HuggingFace. All versions are available. All are free. All are yours to use.

Download them. Use them. Break them. Tell me how they work. Tell me how they fail. Tell me what you build with them.

Final Thoughts

We have come so far. From Haiku-1 to Haiku-2.3. From 1.62 to 4.84. From chaos to coherence. From nothing to something.

This is not the end. This is not even the beginning of the end. But it might be the end of the beginning. The next version will be better. The next version will be stronger. The next version will make this one look primitive.

But for now, I am proud. For now, I am happy. For now, I am celebrating a threefold improvement and six point zero three coherence.

Thank you for following this journey. Thank you for believing in tiny models. Thank you for waiting. We have come so far. And we are just getting started.